Restructuring View Maintenance Plans for Large Update

Authors

  • Bin Liu
  • Elke A. Rundensteiner
  • David Finkel
Abstract

Materialized views defined over distributed data sources are a well-recognized technology in data integration, e-business, and the semantic web. As information sources keep growing and changing rapidly, there is increasing pressure to reduce the time taken to refresh such integration views. State-of-the-art incremental view maintenance requires O(n^2) maintenance queries (batch view maintenance) or more (i.e., sequential maintenance), where n is the number of data sources. In this work, we optimize maintenance performance by restructuring the batch view maintenance plan to reduce the number of maintenance queries sent to remote data sources when maintaining a large set of updates. We first propose an adjacent grouping strategy that exploits the regularity in the batch maintenance plan. This solution reduces the number of maintenance queries by sharing common accesses to data sources. We then propose a conditional grouping approach that reduces the number of remote queries to O(n) by unifying heterogeneous subexpressions (deltas). A cost model for analyzing these approaches is provided. The proposed maintenance strategies have been implemented in our TxnWrap system. Experimental studies show that, in most cases, our conditional grouping algorithm improves total processing time by about 300% compared with existing batch algorithms. Our experiments also reveal an additional dimension of this design space, namely the impact of the cooperation of the remote sources on maintenance performance.
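To make the idea of "sharing common accesses to data sources" concrete, the following is a toy counting model, not the algorithm from the paper. It assumes (hypothetically) that a batch plan over n sources has one term per source delta, and that each term probes every other source once. The function names `batch_plan` and `group_by_target` are illustrative only.

```python
from collections import defaultdict


def batch_plan(n_sources):
    """Toy batch plan: the delta of source i probes every other source j.

    Each (i, j) pair stands for one remote maintenance query.
    This plan shape is an assumption made for illustration.
    """
    return [(i, j) for i in range(n_sources) for j in range(n_sources) if j != i]


def group_by_target(plan):
    """Group probes by the remote source they access.

    If all probes against the same source could be merged into one
    remote query (an assumption), the number of groups is the number
    of remote queries after sharing.
    """
    groups = defaultdict(list)
    for delta, target in plan:
        groups[target].append(delta)
    return dict(groups)


if __name__ == "__main__":
    plan = batch_plan(4)
    groups = group_by_target(plan)
    print("ungrouped remote queries:", len(plan))
    print("remote queries after grouping by source:", len(groups))
```

Under these assumptions the ungrouped plan issues n(n-1) remote queries, while grouping by target source leaves one query per source. Whether probes can really be merged this way depends on the structure of the deltas, which is the question the grouping strategies in the abstract address.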

Similar resources

Incremental Maintenance of Schema-Restructuring Views

An important issue in data integration is the integration of semantically equivalent but schematically heterogeneous data sources. Declarative mechanisms supporting powerful source restructuring for such databases have been proposed in the literature, such as the SQL extension SchemaSQL. However, the issue of incremental maintenance of views defined in such languages remains an open problem. We...

Maintaining large update batches by restructuring and grouping

Materialized views defined over distributed data sources can be utilized by many applications to ensure better access, reliable performance, and high availability. Technology for maintaining materialized views is thus critical for providing up-to-date results, since a stale view extent may fail to help, or may even mislead, these applications. State-of-the-art incremental view maintenance requires O(n^2) or ...

Speeding Up Incremental View Maintenance Using the Cuckoo Algorithm

A data warehouse is a repository of integrated data collected from various sources. It can maintain data from those sources in the form of views, so the views must be maintained and updated when the sources change. Since the increase in updates may cause costly overhead, it is necessary to update views with high accuracy. Optimal Delta Evaluation method is...

Periodic flexible maintenance planning in a single-machine production environment

Preventive maintenance is an essential part of many maintenance plans. From the production point of view, flexibility in the maintenance intervals enhances manufacturing efficiency. In contrast, maintenance departments want the timing of long-term maintenance plans to be as certain as possible. In a single-machine production environment, this paper proposes a simulation–o...

View Maintenance in Web Data Platforms

Modern Web Data Platforms (WDPs) handle large amounts of data and activity through massively distributed infrastructures. To achieve performance and availability at Internet scale, WDPs restrict querying capability and provide weaker consistency guarantees than traditional ACID transactions. The sheer volume of parallel processing without ACID transaction guarantees, and the large number of ind...

Publication date: 2003